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Abstract. We study the statistical characteristics of a box-fitting algorithm to analyze stellar photometric time 
series in the search for periodic transits by extrasolar planets. The algorithm searches for signals characterized 
by a periodic alternation between two discrete levels, with much less time spent at the lower level. We present 
numerical as well as analytical results to predict the possible detection significance at various signal parameters. 
It is shown that the crucial parameter is the effective signal-to-noise ratio — the expected depth of the transit 
divided by the standard deviation of the measured photometric average within the transit. When this parameter 
exceeds the value of 6 we can expect a significant detection of the transit. We show that the box-fitting algorithm 
performs better than other methods available in the astronomical literature, especially for low signal-to-noise 
ratios. 

Key words, methods: data analysis - stars: variables - stars: planetary systems - occultations 
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1. Introduction 

A considerable fraction of the periodic astronomical time 
series can be modeled rather accurately by finite sums 
of sinusoidal components. In general, these Fourier-sums 
have a single dominant component, and therefore the 
basic method of Discrete Fourier Transformation (DFT) 
has become commonplace in almost all applications (e.g., 
Deeming 1975). When the signal becomes distorted by 
higher harmonics (e.g., light curves of fundamental mode 
RR Lyrae and S Cephei stars), this simple approach fails 
to perform properly, due to leakage of the signal power to 
many higher harmonics. One way to deal with this prob- 
lem is to use a multifrcqucncy Fourier fit for better approx- 
imation of the signal shape and thereby to increase the al- 
gorithm efficiency, a method recently suggested by Defay, 
Deleuil & Barge (2001) for the search for extrasolar plane- 
tary transits. Another generally accepted approach is the 
so-called Phase Dispersion Minimization (PDM), which 
searches for the best period that yields the 'smoothest' 
folded time series. 

Application of variants of the PDM method in the 
analyses of variable star observations goes back to ear- 
lier times than that of the DFT. This is primarily be- 
cause the PDM algorithm does not require the computa- 
tion of trigonometric functions, which put a heavy load 
on the computers, especially in those early days. The 
most frequently cited implementation of the PDM idea 



is that of Stellingwerf (1978). However, earlier versions 
had appeared already in the '60s and early '70s, like those 
of Lafler & Kinman (1965, hereafter the L-K method), 
Jurkevich (1971) and Warner & Robinson (1972, here- 
after the W-R method) . Actually, it can be shown that up 
to a frequency- (or trial period-) independent constant, 
the method of Jurkevich (1971) is equivalent to that of 
W-R (see, Kovacs 1980). Furthermore, without the ad- 
ditional feature of overlapping bin structure, the method 
of Stellingwerf (1978) is equivalent to that of Jurkevich 
(1971). 

The study of the algorithm presented in this paper has 
been stimulated by the increasing interest in searching for 
periodic transits by extrasolar planets (e.g., Gilliland et al. 
2000; Brown & Charbonneau 2000; Udalski et al. 2002), 
which follow the discovery of the transit of HD 209458 
(Charbonneau et al. 2000; Henry et al. 2000) by its plan- 
etary companion (Mazeh et al. 2000). Due to the short 
duration of the transit relative to the orbital period (typi- 
cally less than 5%), the signal expected is extremely non- 
sinusoidal. Considering the shallowness of the transit (typ- 
ically less than 2% in the case of Jupiter-size planets) and 
the expected high noise level of the ground-based small 
telescopes capable of performing large-scale surveys (e.g., 
Borucki et al. 2001), we suggest an algorithm that utilizes 
the special form of the signal. The algorithm studied here 
(see also Gilliland et al. 2000 and Udalski et al. 2002) is 
based on direct Least Squares (LS) fits of step functions 
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to the folded signal corresponding to various trial peri- 
ods. We present numerical simulations as well as analyti- 
cal considerations to estimate the ability of the algorithm 
to detect a faint signal in a noisy time series. We show 
that the algorithm performs significantly better and more 
efficiently than the published variants of PDM, DFT or 
some LS modification of the latter. 

2. The box-fitting algorithm 

We assume a strictly periodic signal, with a period Pq, that 
takes on only two discrete values, H and L. The time spent 
in the transit phase L is qPo , where the fractional transit 
length q is assumed to be a small number (w O.Of — 0.05). 
For any given set of data points, the algorithm aims to 
find the best model with estimators of five parameters — 
Po, q, L, H and to, the epoch of the transit. Actually, 
if we assume the average of the signal is zero, we have 
H = —Lq/{\ — q), and the number of parameters of the 
model is reduced to four. 

Let us denote the data set by {xi}, i = 1,2, ...,n. 
Each Xi includes an additive zero-mean Gaussian noise 
with (Tj standard deviation. The noise is presented by 
assigning to each data point a weight to,, defined as 
w i = cr r 2 [Ej=i a J ^ is further assumed that {wiXi} 
have a zero arithmetic average. 

For a given trial period we consider a folded time series, 
which is a permutation of the original time series. This 
series is denoted by {ii} and the corresponding weights 
by {wi}. We fit a step function to the folded time series 
with the following parameters: 

— L — the level in [ii, i 2 ] 

— H — the level in [1, ii) and (i 2 , n] 

The relative time spent at level L is characterized by r = 
y^Lj Wi, i.e., by the sum of the weights of the data points 
at level L. 

For any given (11,12), we minimize the expression 



V= ^w l (x l -H) 2 + M^-Hf 



i=l 



i=i 2 + l 



2J u>i(xi - L) 2 



(1) 



Minimization of T> yields simple weighted arithmetic av- 
erages over the proper index regimes 

L = i , H = -j^~ , (2) 
r 1 — r 

where 

With these formulae, the average squared deviation of the 



fit becomes 



Once this expression is evaluated, one has to repeat 
the computation with other (11,12) values and find the 
absolute minimum of T> for any given period. The first 
term on the right hand side of Eq. (4) does not depend 
on the trial period, and therefore one can use the second 
term alone to characterize the quality of the fit. We define 
the Box-fitting Least Squares (BLS) frequency spectrum 
by the amount of Signal Residue of the time series at any 
given trial period: 



SR = MAX 



l {hM) 



r(h,i 2 )[l - r(ii,i 2 )] 



(5) 



WiXi 



r(l — r) 



(4) 



Here, the maximization goes over the values i\ = 
1,2, ...,n*, while the i 2 values satisfy the inequality 
Ai mi „ < i 2 - %i < Aimax, where Ai min/max are de- 
termined by the minimum/maximum fractional transit 
length suspected to be present in the signal. The maxi- 
mum lower index n* depends on i 2 — i\ and covers the 
range of [n - Ai max ,n- Ai min }. 

The most obvious meaning of SR follows immediately 
from Eqs. (1) and (4). At the maximum value of SR, T> of 
Eq. (4) is related to the average variance of the noise. By 
using the definition of 8 = H — L for the transit depth, 
and the corresponding estimates of H and L of Eq. (2), 
we find that SR at the correct test period yields also an 
estimate of 8, i.e., SR = 8y/r(l — r). 

In a practical implementation of the above procedure^], 
we suggest to divide the folded time series into m bins and 
evaluate SR by using these binned values. This approach 
is very efficient computationally, and yields an exact LS 
solution with a time resolution defined by the number of 
bins. Although the lower resolution affects the efficiency of 
the signal detection, in all interesting cases a good compro- 
mise can be made between computational constraints and 
the effectiveness of signal detection (sec next section). One 
can further minimize the amount of computation by rec- 
ognizing that for a given period each s(i±, i 2 ) and r(ii, i 2 ) 
can be obtained by adding the corresponding bin values 
to the already evaluated functions of smaller i\ and 12- 

As an introductory example of the power of the BLS 
algorithm, we show in Fig. 1 the result of the analysis of 
a test signal with very low signal-to-noise ratio. Here and 
in all subsequent simulations we assume all data points 
have the same noise level, characterized by a, and there- 
fore the signal-to-noise ratio (SNR) is defined by 8 /a. The 
figure demonstrates that the BLS spectrum is able to iden- 
tify the correct period even at a high noise level. Several 
high peaks appear in the spectrum at integer fractions and 
multiples of the true period. This feature is common in all 
methods utilizing higher harmonics of the signal. In this 
example we used 50 bins, which is a reasonable compro- 
mise between computational efficiency and signal resolu- 
tion (see next section for details). In plotting the folded 
time series we used 100 bins. Here and in all subsequent 
figures the full frequency band is divided into w 1000 bins 

1 A fortran'77 version of the BLS algo rithm is accessible 
at http:// www. konkoly. hu/staff /kovacs. htrni 
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Fig. 1. An example of the power of the BLS method. The 
time series, the normalized BLS frequency spectrum and 
the folded time series are shown in the upper and lower two 
panels, respectively. The signal parameters are displayed 
in the header (see text for details). 

and only the maxima in these bins are plotted. In this way 
all important information is retained and showing unnec- 
essary details is avoided. The final spectra are normalized 
in the [0, 1] interval. 



components of the true period from the 10 th subhar- 
monics up to the 3 rd harmonics. Usually we use 4000 
frequency steps, because, due to the specific signal shape, 
the line profiles are very narrow. Therefore, our algorithm 
requires a much finer sampling than DFT (see also 
Stellingwerf 1978). 

To characterize the Signal Detection Efficiency we in- 
troduce (see also Alcock et al. 2000): 



SDE = 



SR, 



peak 



- (SR) 



sd(SR) 



(6) 



where SR pea k is the SR at the highest peak, (SR) is the 
average, and sd(SR) is the standard deviation of SR over 
the frequency band tested. Because in the practical com- 
putation of SDE one uses all available spectral points, in 
the presence of periodic signal, the actual value of SDE 
also depends on the time spanned by the data and on the 
lengths and position of the frequency band of the analysis. 
If all other parameters are kept constant, increasing time 
span or frequency band leads to an increase in SDE for 
signals containing periodic component (s). This is because 
under the condition mentioned, the relative contributions 
to (SR) and sd(SR) of the peaks associated with the sig- 
nal become smaller. Of course, aliasing leads to a decrease 
in SDE. 



3. Properties of the Box-Fitting algorithm 

This section focuses on the applicability of the Box-fitting 
algorithm to different time series, and to give signal- and 
noise-dependent confidence limits. Due to the statistical 
complexity, most of the results are based on extended 
numerical tests. Nevertheless, whenever possible, we also 
present some analytical approximations. 

In all subsequent tests the signals have a period of 5 d 
and span a timebase T of 60 d . The signal is sampled at 
times U — (i — 1 + $)T/n, where i9 is a uniformly dis- 
tributed random variable in [0, 1]. This distribution corre- 
sponds to the timings we may obtain during a short but 
concentrated and continuous observational campaign from 
several ground-based observatories or from space. If the 
sampling or test periods are different from the ones used 
in this paper, the main results presented here still remain 
valid, assuming that the distribution of the data in the 
folded time series at any trial period is basically uniform. 
Our non-periodic sampling yields aliasing-free spectra in 
the frequency band of interest. We do not deal with alias- 
ing effects originating from nearly periodic sampling, be- 
cause this problem affects all period searching algorithms 
in a similar way. 

In all our simulations, the search for the fractional 
transit length is performed in the [0.01,0.10] range. The 
upper limit is well above the expected maximum fractional 
transit length for planets (see Defay et al. 2001), but is in 
the right range for detached binaries. 

The frequency spectra are computed in the 
(0.02, 0.91)d _1 band, a range that contains frequency 



3.1. Dependence on the transit phase 

In the practical implementation of BLS one has to satisfy 
two conflicting requirements: (a) a high time resolution 
— demanding a large number of bins; (b) short execution 
time and statistical stability — supporting a small number 
of bins. For example, assuming a fractional transit length 
of 2%, one would like to have at least 50 bins, otherwise 
the transit signal will be included in a wider bin, degrad- 
ing the signal. However, 50 bins in such a case are not 
enough, because in most cases the transit will be divided 
into two bins, causing the same effect. For a more secure 
coverage we would probably need to double or triple the 
number of bins. However, this could be rather time con- 
suming, especially if the allowed range of fractional transit 
length is assumed to be large (say > 0.1). The sensitive 
dependence of the execution time on the number of bins 
follows from the fact that the number of operations in the 
BLS algorithm is proportional to m(Ai max — Az m i„), and 

When using a finite bin size, we expect some depen- 
dence of the BLS spectra on the transit phase, i.e., on 
the position of the transit within the folded time series. 
We illustrate this effect in the case of a noiseless signal in 
Fig. 2. The figure shows the strong increase of the power 
in the harmonic and subharmonic components, in addition 
to similar changes in other substructures of the frequency 
spectrum. 

In order to have a better idea of the possible ranges 
of SDE for different numbers of bins, we show in Fig. 3 
the SDE as a function of the fractional transit phase 
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Fig. 2. BLS spectra computed for two different transit 
phases for the same number of bins and signal parameters, 
shown in the header. 
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Fig. 3. Dependence of the SDE of the BLS spectra on the 
transit phase for noiseless signals with parameters shown 
in the header. 

At/ (qPo), where At denotes the shift in time of the start- 
ing moment of the transit with respect to some arbitrary 
epoch. The transit length is chosen so that at m = 50 the 
bin size is about half the transit length. The figure shows 
that even this number of bins displays relatively large vari- 
ations of SDE as a function of the transit phase, let alone 
the m = 20 case. The high number of bins clearly yields 
high, more stable SDE. However, the necessary increase 
in the CPU time is rather high (see later). On the other 
hand, the low number of bins may yield a fairly low SDE, 
due to occasional partial coverage of the transit and wide 
bin size. The periodicities in SDE come from the equidis- 
tant bin distribution and are given by (mq)" 1 in units of 
qPo- 

The SDE dependence on the transit phase is a func- 
tion of the transit length. To show this function we plot 
in Fig. 4 the same dependence for a transit length of 
q = 0.015, instead of 0.035 as in Fig. 3. We see now that 
the SDE corresponding to m = 20 is stable, although 



n= 3000 q= 0.015 6=0.010 6/a= °° 
: m=20(") 50 (o) 400 (*) 
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M/(qP ) 

Fig. 4. As in Fig. 3, but for a signals of shorter fractional 
transit length. 

with lower values. This is so because for a small number 
of bins the full transit is included in one bin in most cases 
- yielding SR values independent of the transit phase. 
The strong dependence on the transit phase occurs when 
the bin size is comparable to the actual length of the tran- 
sit. In both examples, a high enough number of bins yields 
higher, and more stable SDE, which, in the presence of 
noise, means also a higher probability of signal detection 
(see Sect. 3.3). 

3.2. Response to pure noise 

One of the most important features of any search algo- 
rithm is its ability to distinguish between false and true 
signals. In our case, this translates to the estimation of 
the statistical significance of the highest peak in the BLS 
spectrum. To study this question we performed inten- 
sive numerical tests to derive the Probability Distribution 
Function (PDF) of SDE in case the input data contain 
only Gaussian noise. 

Our results show that the PDFs depend only on the 
number of trial frequencies n./, and are immune (within 
the numerical stability of the simulations) to changing the 
number of data points or number of bins used by the BLS 
algorithm. In order to ensure reasonable numerical stabil- 
ity, we use in all cases 1500 realizations to estimate the 
corresponding PDF. The number of data points is set to 
500, 1000, or 2000, while the number of bins is chosen to 
be 50 or 100. Fig. 5 displays the empirical PDFs obtained 
for various numbers of trial frequencies. 

One could expect the probability of finding outstand- 
ing peaks in a noisy spectrum to increase with the number 
of sampled frequencies, if the samples were statistically 
independent. We derive a theoretical PDF by using this 
assumption, and show that adjustment of this PDF to the 
empirical one is possible, with parameters different from 
the ones predicted by the assumption of large independent 
samples. 
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Fig. 5. Probability distribution functions of SDE for the 
BLS method at various numbers of test frequencies shown 
at the horizontal lines. Thick lines are for the empirical 
(numerical), thin lines are for the semi-theoretical results 
described in the text. 

The computation of SDE consists of two major steps: 

(a) selection of the largest SR{ii,i2) at each trial period, 

(b) selection of SR pea ^ — the largest SR of all the val- 
ues computed for the n/ trial frequencies. In general we 
have mnf ^> n, and therefore it is obvious that many of 
the tested SR(ii, 12) values cannot be considered to be in- 
dependent samples. Therefore, in this — admittedly not 
exact, but, as we shall show below, quite practical — ap- 
proach, it is assumed that the distribution of SDE can 
be approximated by the one obtained for n independent 
samples of SR(ii,i2), where h is to be determined by nu- 
merical simulations. 

Assume, for the sake of simplicity, that all data points 
have the same error, o. We take h independent values of 
the random variable SR(i\,i2), identify the highest value 
SRpcuk and compute SDE. For a large sample size we 
can assume the values of (SR) and sd(SR) are constant. 
Therefore, one can use Eq. (6) to write the probability 
that SDE exceeds a specified value X: 

P(SDE > X) = P(SR pcak > x) , 

where x — X x sd(SR) + (SR). Let p be the probability 
that a given sample of SR has a value larger than x. Then, 

P(SDE >X) = P(SR pcak >x) = l-(l- pf. 

The value of p can be calculated under the assumption of 
pure noise. In such a case SR is actually the absolute value 
of a zero-mean Gaussian random variable with a variance 
of <r 2 /n. This distribution implies sd(SR) = aa/y/n and 
(SR) = ba/^/n~ where a = yjl - 2/tt = 0.60 and b = 
\plpK = 0.80. To facilitate the calculation of p we can use 
the normalized random variable \ — SRy/n/a and write: 

p = P(SR > x) 
= P(x > xy/H/a) 



= 2(1 - *(xy/n/o)) 
= 2(1 - $(aX + b)) 



(7) 



where <3? is the commulative distribution function of the 
normalized Gaussian variable. If all the n/ samples were 
independent, we could use n/ instead of n. Since this is 
not the case, in order to generalize the above calculation 
we assume that the effective h is related to n/ by some 
power law: h = n c ^. The parameters a and b depend on 
the assumption of constant standard deviation and mean 
of SR as well as on its specific distribution. Therefore, in 
order to fit the empirical PDFs we allow the three param- 
eters a, b and c to vary. The best-fit values we get are: 
a = 0.67, b = 0.36, c = 0.83. These values are significantly 
different from the ones obtained with our simplifying as- 
sumptions. However, as Fig. 5 shows, the functional form 
of the above PDF with the fitted parameters gives a good 
approximation of the numerical results. 

3.3. Signal detection power 

Before turning to the numerical simulations, we derive a 
simple estimate of the minimum number of data points 
necessary obtain a high enough signal detection probabil- 
ity. Here we deal with the folded time series at the signal 
period, and ask the question whether the differences in 
SR 2 (ii,i 2 ) = s 2 (ii,i 2 )/{r(ii,i2)(l-r(ii,i 2 ))] between the 
values including or excluding the transit event are signifi- 
cant. We therefore evaluate the Dip Detection Efficiency: 

E[SR 2 (j uj2 )}-E[SR 2 (k 1 ,k 2 )} 



DDE 



(8) 



where the indices j, k refer to the 'in' and 'out' of transit 
values, respectively, and E[...], <r 2 [...] stand for the expec- 
tation value and variance of the corresponding random 
variable. If we assume equal noise for all data points and 
optimal bin size (i.e., r = q), then the averages and stan- 
dard deviations read: 



E{SR 2 (i 1 ,i 2 ) 



S + - 

n 



+ 2- 



(9) 



a 2 [SR 2 (i u i 2 )] = 45* - 

n 

where So denotes the corresponding noise-free value of 
SR 2 (ii,i 2 ), with i = j, So = q(l — q)8 2 for the 'in transit' 
and i = k, So = S 2 q 3 /(1 — q) for the 'out of transit' cases. 
By employing the q <C 1 condition and retaining only the 
leading terms in q, and omitting the quartic terms in cr, it 
is easy to get the final expression for DDE 



DDE 



A G 



Based on this result, we use 



Q 



nq 



(10) 



(11) 



to parameterize the effective SNR, because a/^/nq is the 
standard deviation of the average of all measurements 
within the transit. 
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Fig. 6. Signal detection efficiency for noisy signals as a 
function of a = ^^fnq. The same bin number of 100 and 
fractional transit length of 0.033 are used in all simula- 
tions. 

We performed 1500 simulations with m = 100, S = 
0.01, q = 0.033, n = 1000-4000, and various transit 
phases. The dependence of SDE on a is shown in Fig. 6. 
We see that at a < 5 the SDE values are noise-dominated 
at about constant level, almost exclusively below 6. The 
SDE values generated by pure noise fall also under 6 
with very high probability (see Fig. 5). In the range of 
a = 6 — 13, SDE displays a sharp linear increase, until it 
reaches a mild maximum. It turns out that when a gets 
to the linear increase region, the associated value of SDE 
starts to become significant. 




10 20 30 40 50 

a 



Fig. 7. Differential signal detection efficiency for noisy sig- 
nals as a function of a. Shaded circles and filled squares 
show results for A(SDE) — SDE m=2 oo — SDE m - 2 o, and 
SDE m —Qo — SDE m= 2Q, respectively. The transit length is 
the same in all cases and is chosen to be relatively small, 
i.e., q = 0.015. 

Tests performed with other numbers of bins show that 
the above pattern remains the same, with the sole differ- 



ence that for a larger number of bins SDE reaches larger 
maxima, and therefore, the linear region between a = 6 
and 13 becomes steeper. It is important to note that the 
region of a around 6 is critical in all cases, because of the 
separation between the stochastic and deterministic signal 
detections. As we could expect, this regime coincides with 
the fa 3d limit of DDE (see Eq. (10)). 

In the above simulations we fixed the fractional transit 
length at a relatively large value, in order to ensure rea- 
sonable resolution with 100 bins. With a shorter transit 
length, at the same number of bins, the a > 15 region 
becomes fuzzier, with a somewhat higher average value 
of SDE (at least for moderately smaller transit lengths). 
However, the global properties remain the same as de- 
scribed above. 

The response of SDE to changing the number of bins 
is illustrated in Fig. 7. We see that a substantial increase 
of SDE can be gained in many cases by moving from 50 
bins to 200 bins. Of course, for longer transit lengths, the 
gain is smaller. 

4. Comparison with other methods 

The purpose of this section is to illustrate that the method 
introduced in this paper enables us to discover periodic 
transit-type events in noisy time series with a (much) 
higher probability than the other standard period search- 
ing algorithms available in the current astronomical liter- 
ature. In all subsequent examples we use m = 200 for the 
BLS method. 
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Fig. 8. As in Fig. 6, but for the W-R method with 100 
bins. 

Perhaps the most competitive method is that of W- 
R (or similarly, PDM of Stellingwerf 1978). Therefore, we 
perform the same test for this method as the one presented 
in Fig. 6 for the BLS method. The result is shown in Fig. 8. 
For compatibility, we use the same bin number and sig- 
nal parameters as for the simulations shown in Fig. 6. In 
comparison, we see that the W-R method yields a wider 
transition region between the noise- and signal-dominated 
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regimes. Furthermore, the value of SDE is lower in the 
signal-dominated region. In order to illustrate the appear- 
ance of these differences in the frequency spectra, we show 
a representative example in Fig. 9. We use 100 bins in the 
W-R method, because a larger number leads to an even 
poorer performance of this method. 



n=2000 P =5.0 q= 0.025 6=0.010 6/q=4.0 



n=2000 P =5.0 q= 0.025 6= 0.010 6/q= 1.0 
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Fig. 9. Comparison of the BLS and W-R methods for a re- 
alization of a signal with parameters shown in the header 
(other parameters are standard, as given in Sect. 3). 
The uppermost panel shows the folded/binned time se- 
ries (dots) with the period and the fit (continuous line) 
obtained by the BLS method. 

The next example examines the L-K method. We recall 
that in this method the squared differences between the 
bin-averages are computed for each folded time series. For 
large bin numbers and smoothly varying time series, this 
method should yield very small (in the limiting case, zero) 
L-K statistics at the correct period. Certainly, in the case 
of periodic signals with discontinuous variations, such as 
the ones studied in this paper, the L-K method should give 
non-zero statistics even in the noiseless case. Indeed, even 
for very high SNR, the L-K method performs extremely 
poorly (see Fig. 10). We use 500 bins for the L-K method, 
because tests have shown that at lower values the L-K 
method performs even more poorly. By decreasing SNR 
to 2-3, there remains no significant dip close to the test 
frequency (or to its harmonics) in the L-K spectrum. It is 
important to recall that this noise level still corresponds 
to a = 14-21, well above the secure signal detection limit 
for BLS. 

As expected, DFT is also not preferable for analyz- 
ing signals with periodic transits of short duration. In the 
noiseless case, DFT yields very slowly decreasing power 
for the successive harmonics. Although this is not a good 
property for correct period identification, DFT shows a 
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Fig. 10. As in Fig. 9, but for the L-K method. On the 
ordinate axis for the L-K method, SD stands for 'standard 
deviation', the corresponding statistics of the L-K method. 
Minima of SD indicate periodicities in the signal by the 
the L-K statistics. 
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reasonable stability against noise. As shown in Fig. 11, 
some remnant power at (or close to) the original harmonics 
is still visible in the DFT spectrum, but a reliable identifi- 
cation is not possible even at this high SNR (a = 11). At 
an even higher SNR, corresponding to a — 14, the main 
component becomes the highest amplitude peak, with a 
simultaneous increase of the harmonics and with a still 
considerable contribution from other parts of the spec- 
trum, originating from the noise. 
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Fig. 12. As in Fig. 9, but for the the Fourier-sum-fitting 
Least Squares method with 10 components. Minima in the 
FLS spectrum indicate periodicities in the signal. 

Finally, one can attempt to use multifrequency LS 
Fourier fit (FLS) for better approximation of the signal 
shape and thereby increasing the SDE in the case of the 
Fourier method. (We note that for data distribution lead- 
ing to orthogonal Fourier base, the method of Defay et al. 
(2001) is equivalent to this direct LS Fourier approach.) 
Because of the substantial increase in execution time for 
this method, we use a smaller number of data points and 
limit the order of the Fourier fit to 10. By plotting the 
standard deviation of the residuals, we obtain the result 
shown in Fig. 12. Although the performance of FLS could 
be improved by using more harmonics, FLS did, in gen- 
eral, a considerably poorer job at moderately high noise 
levels (similarly to the one shown in Fig. 12). 

Computational efficiency of the BLS method has been 
tested in comparison with DFT on a SUN workstation. For 
proper comparison, the same data sets and the same num- 
ber of frequency steps were computed by both methods, 
although DFT requires about an order of magnitude fewer 
steps, because of its wider line profiles. The range of the 
search for the best fractional transit length in the BLS 
method was fixed at [0.01,0.10], as mentioned at the be- 
ginning of Sect. 3. Table 1 shows that, except for high bin 
numbers and low number of data points, BLS surpasses 
DFT in execution time per frequency step. While the CPU 
time required by DFT is determined by the number of 
data points, BLS depends much less on this parameter. 
The BLS execution time can be decreased by decreasing 
the allowed range of fractional transit length. For example, 
with n — 8000, m = 500 the execution time decreases to 
13 s, if [r m i n , r max ] = [0.005, 0.03]. Note that by increasing 
the SDE of the Fourier method with the aid of a multi- 
frequency least squares approach (or its nearly equivalent 



version using sum of the amplitudes of the harmonics — 
see Defay et al. 2001) the execution time will increase 
substantially (proportional to the number of harmonics) 
for the Fourier method. A multifrequency Fourier method 
must use frequency steps as small as for the BLS method, 
while it has to use at least 7-10 harmonics to approxi- 
mate reasonably closely the SDE of the BLS method. We 
therefore suggest that the BLS method can be considered 
a very efficient tool to analyze transit-type signals. 

Table 1. Comparison of the execution times 
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^DFT / ^BLS 


1000 
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100 
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200 


4.0 


1.6 




500 


19.1 


0.3 
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50 


4.3 


7.6 




100 


5.0 


6.6 




200 


7.6 


4.3 




500 


22.6 


1.5 


8000 


500 


25.0 


2.1 



Notes: — 5000 test frequencies are computed 

— a SUN Ultra 300 MHz machine is used with 
an optimized FORTRAN compiler 



5. Conclusions 

This paper has examined the statistical characteristics of 
the Box-fitting Least Squares algorithm to detect periodic 
transits in time series of stellar photometric observations. 
The algorithm strongly relies on the anticipated box-shape 
of the periodic light curve. The advantage of using a pre- 
determined shape of the light curve manifests itself in the 
high efficiency of this method relative to the other search 
methods, which are generic and can detect any periodic 
variation. 

The algorithm studied here assumes only two levels 
of the periodic light curve. This assumption ignores all 
other features that are expected to appear in planetary 
transits. Thus, we ignore the gradual ingress and egress 
phases of the transit, which carry important information 
about the parameters of the planetary orbit (e.g, Sackett 
1999). The lengths of these phases are short compared 
to the transit and thus they are not expected to affect 
significantly the results of the search. Another effect we 
ignore is the limb-darkening effect, which has indeed been 
shown to be small in the case of HD 209458 (e.g., Deeg et 
al. 2001). The effectiveness of the algorithm relies on the 
above simplifying assumption, which is justified as long as 
we are interested in a detection tool. After the periodicity 
is detected we can try to recover subtle features of the 
folded light curve, in order to derive the stellar and the 
planetary characteristics. 

Our main interest is in cases where the signal-to-noise 
ratio is small, and one cannot identify the signal by moni- 
toring a single transit, because the stellar drop in intensity 
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is buried in the noise. Contrary to the search of transits by 
the HST in 47 Tuc (Gilliland et al. 2000), where the noise 
was small relative to the expected transit dip, we concen- 
trate on cases in which the periodic signal can be detected 
only after many measurements are accumulated and the 
unknown transit is monitored many times. To be able to 
deal with a large number of observations, of a thousand 
or more, we have introduced binning into the folded data. 
We have shown that as long as the bin size is small com- 
pared to the expected transit length, the efficiency of the 
method is not affected. 

One additional factor that determines the computa- 
tional load of the algorithm is the range of transit length 
searched for. The maximum possible transit length can 
be estimated if we know the orbital, stellar and planetary 
radii. For a given stellar mass, the stellar radius can be de- 
rived by the mass-radius relation, and the orbital radius 
can be derived for any period. Recent theories give some 
estimates for the planetary radii. Therefore, for a given 
stellar mass we can estimate the maximum duration of 
the transit, which for HD 209456 is only a few percent of 
the period. For most ground-based and space searches for 
planetary transits one would have some idea of the stellar 
mass of all transit candidates, and therefore we can make 
our algorithm computationally more efficient by imposing 
a variable maximum duration on the transit length. 

The significance of the detection depends primarily on 
the effective signal-to-noise ratio of the transit. The sig- 
nal is the stellar brightness within the transit, relative to 
the brightness outside the transit, and the noise is the 
expected scatter of the measured average of the stellar 
brightness inside the transit. The scatter is composed, ob- 
viously, of the observational noise as well as of stochastic 
variation of the stellar intensity. It seems that the effec- 
tive signal-to-noise ratio should exceed 6 in order to get 
a significant detection. This requirement should be taken 
into account when planning future searches for extrasolar 
planetary transits. 
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